---
title: Mask visualization
keywords: fastai
sidebar: home_sidebar
summary: "Visualize defects."
description: "Visualize defects."
nb_path: "dev_nbs/02_masks.ipynb"
---

An example

Given an ImageId, we can retrieve the pixel coordinates (or the mask) of every defect. Here is an example of a steel image.

{% raw %}
img_id = "f383950e8.jpg"
tmp_df = train[train['ImageId'] == img_id]
img_path = train_path / img_id
im = Image.open(img_path)

print(tmp_df[["ImageId", "ClassId"]])
im
            ImageId  ClassId
6757  f383950e8.jpg        3
6758  f383950e8.jpg        4
{% endraw %} {% raw %}
height, width = im.shape
height, width
(256, 1600)
{% endraw %} {% raw %}
assert height == 256
assert width == 1600
{% endraw %}

Picking functions

{% raw %}
{% endraw %} {% raw %}

get_random_idx[source]

get_random_idx(n:int)

Return a random permutation of the integers 0 to n-1.

{% endraw %} {% raw %}
get_random_idx(50)
array([13, 26, 47, 21, 28, 37, 44, 24, 15, 32, 20, 49, 43, 11,  3, 46, 22,
       17, 40,  4, 33,  7, 30,  9, 48, 18, 35,  2, 34, 45, 23, 36, 29, 10,
       16, 19, 38,  8,  1, 31, 39, 12, 41, 42,  5, 14, 27,  0,  6, 25])
{% endraw %} {% raw %}
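The output above contains every integer from 0 to 49 exactly once, which is consistent with a random permutation. A minimal sketch of such a helper follows; the name random_idx_sketch and the seed parameter are assumptions added for illustration, not the library's actual implementation.

```python
import numpy as np

def random_idx_sketch(n: int, seed=None) -> np.ndarray:
    """Hypothetical stand-in for get_random_idx: a shuffled copy of 0..n-1."""
    # `seed` is an added parameter (not in the original signature) so that
    # the shuffle can be reproduced when needed.
    rng = np.random.default_rng(seed)
    return rng.permutation(n)

idx = random_idx_sketch(50, seed=0)
print(idx[:5])
```

Each index appears exactly once, so the result can be used to visit every image of a list in random order.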
{% endraw %} {% raw %}

get_perm_imgs_path[source]

get_perm_imgs_path(train_pfiles:L, df:DataFrame)

Return the list of image paths for the images in df.

{% endraw %} {% raw %}
get_perm_imgs_path(train_pfiles, train_all)[:5]
(#5) [Path('../data/train_images/6e30e4696.jpg'),Path('../data/train_images/5f459aeb3.jpg'),Path('../data/train_images/7d8cb6714.jpg'),Path('../data/train_images/99e38d1ee.jpg'),Path('../data/train_images/bbd5080f6.jpg')]
{% endraw %}

RLE functions

{% raw %}
{% endraw %} {% raw %}

rle2mask[source]

rle2mask(rle:str, value:int, shape)

From an RLE-encoded pixel string, return a mask of shape (height, width) in which defective pixels are set to value and all other pixels to 0 (e.g. with value=1: 1 -> defect, 0 -> background).

{% endraw %} {% raw %}
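To make the decoding step concrete, here is a minimal sketch of an RLE decoder. It assumes the usual Kaggle convention for this kind of data: space-separated "start length" pairs, 1-based starts, and pixels numbered down each column first. The name rle2mask_sketch is hypothetical and the body is not necessarily how rle2mask is implemented.

```python
import numpy as np

def rle2mask_sketch(rle: str, value: int, shape) -> np.ndarray:
    """Decode 'start length start length ...' into a (height, width) mask."""
    height, width = shape
    flat = np.zeros(height * width, dtype=np.uint8)
    tokens = [int(t) for t in rle.split()]
    starts, lengths = tokens[0::2], tokens[1::2]
    for start, length in zip(starts, lengths):
        # Assumption: starts are 1-based, so shift by one for NumPy indexing.
        flat[start - 1:start - 1 + length] = value
    # Assumption: pixels are numbered column by column, hence Fortran order.
    return flat.reshape((height, width), order="F")

mask = rle2mask_sketch("1 2 6 1", value=1, shape=(3, 4))
print(mask)
```

In this toy 3x4 example, the run "1 2" fills the first two pixels of the first column and "6 1" fills the last pixel of the second column.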
{% endraw %} {% raw %}

make_mask[source]

make_mask(item, flatten=False, df=None)

Given an item as:

  • row index [int] or
  • ImageId [str] or
  • file [Path] or
  • query [pd.Series],

returns the image id (a str) and its mask, built from df (or from the dataframe train_pivot if df is not specified). The mask has one of two shapes:

  • (256, 1600) if flatten,
  • (256, 1600, 4) if not flatten.

{% endraw %} {% raw %}
img_name = "f383950e8.jpg"
row_df = train_multi.loc[train_multi["ImageId"] == img_name].iloc[0]
items = [0, img_name, (train_path / img_name), row_df]
{% endraw %} {% raw %}
masks = list(map(make_mask, items))

for imgid, mask in masks:
    test_eq(type(imgid), str)
    test_eq(mask.shape, (256, 1600, 4))
{% endraw %} {% raw %}
func = partial(make_mask, flatten=True)
masks = list(map(func, items))

for imgid, mask in masks:
    test_eq(type(imgid), str)
    test_eq(mask.shape, (256, 1600))
{% endraw %} {% raw %}
row_df
ClassId
ImageId          f383950e8.jpg
ClassId_multi              3 4
Name: 11963, dtype: object
{% endraw %} {% raw %}
_, mask = make_mask(row_df)
_, flatten_mask = make_mask(row_df, flatten=True)

np.unique(mask), np.unique(flatten_mask)
(array([0., 1.], dtype=float32), array([0., 3., 4.]))
{% endraw %} {% raw %}
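The unique values above show the difference between the two layouts: the 4-channel mask is binary, while the flattened mask stores the class id (here 3 and 4) in each defective pixel. One way to obtain the second from the first is sketched below; flatten_mask_sketch is a hypothetical helper, and the rule that the highest class id wins where defects overlap is an assumption, not necessarily what make_mask does.

```python
import numpy as np

def flatten_mask_sketch(mask: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, C) binary mask into an (H, W) map of class ids 1..C."""
    n_classes = mask.shape[-1]
    class_ids = np.arange(1, n_classes + 1)
    # Weight channel k by its class id k+1, then keep the largest value per
    # pixel (assumption: the highest class id wins where defects overlap).
    return (mask * class_ids).max(axis=-1)

# Toy 2x3 image with one pixel of class 3 and one pixel of class 4.
mask = np.zeros((2, 3, 4), dtype=np.float32)
mask[0, 0, 2] = 1
mask[1, 2, 3] = 1
flat = flatten_mask_sketch(mask)
print(np.unique(mask), np.unique(flat))
```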
{% endraw %} {% raw %}

mask2rle[source]

mask2rle(mask)

From https://www.kaggle.com/paulorzp/rle-functions-run-lenght-encode-decode

Attributes: mask: numpy array with 1 -> mask, 0 -> background.

Returns: the run-length encoding as a formatted string.

{% endraw %} {% raw %}
rle = mask2rle(mask)
rle[:100]
'883001 4 883256 11 883512 17 883767 24 884023 30 884278 114 884533 123 884789 131 885044 137 885300 '
{% endraw %}
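For reference, the encoding can be sketched in a few NumPy lines, in the spirit of the Kaggle kernel linked above. The name mask2rle_sketch is hypothetical, and the sketch assumes a 2D binary mask with 1-based, column-major pixel numbering.

```python
import numpy as np

def mask2rle_sketch(mask: np.ndarray) -> str:
    """Encode a 2D binary mask as a 'start length start length ...' string."""
    # Transposing before flattening walks the mask column by column.
    pixels = mask.T.flatten()
    # Pad with zeros so that runs touching either end are still detected.
    pixels = np.concatenate([[0], pixels, [0]])
    # 1-based positions where the value changes: run starts and run ends alternate.
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    # Turn every run end into a run length.
    runs[1::2] -= runs[::2]
    return " ".join(str(x) for x in runs)

mask = np.array([[1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [0, 1, 0, 0]], dtype=np.uint8)
rle = mask2rle_sketch(mask)
print(rle)  # 1 2 6 1
```

An all-zero mask yields an empty string, since there are no value changes to record.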

Plot defects

{% raw %}
{% endraw %} {% raw %}
fig, ax = plt.subplots(1, 4, figsize=(15, 5))
for i in range(4):
    ax[i].axis('off')
    ax[i].imshow(np.ones((50, 50, 3), dtype=np.uint8) * palet[i])
    ax[i].set_title("class color: {}".format(i+1))
fig.suptitle("each class colors")

plt.show()
{% endraw %} {% raw %}
{% endraw %} {% raw %}

plot_mask_image[source]

plot_mask_image(name:str, img:array, mask:array)

Plot a np.array image and mask with contours.

{% endraw %} {% raw %}
img_name = "f383950e8.jpg"
img = cv2.imread(str(train_path/img_name))
_, mask = make_mask(img_name)
{% endraw %} {% raw %}
plot_mask_image(img_name, img, mask)
{% endraw %} {% raw %}
{% endraw %} {% raw %}

plot_defected_image[source]

plot_defected_image(img_path:Path)

Plot the image at img_path (a Path in the training folder) with its defect contours.

{% endraw %} {% raw %}
plot_defected_image(img_path)
{% endraw %}

Exploring defects

{% raw %}
{% endraw %} {% raw %}

show_defects[source]

show_defects(class_id=None, n=20, only_defects=True, multi_defects=False)

Plot multiple images.

Attributes:

  • class_id [str or int]: select one type of defect; if None, plot all types;
  • n: number of images to plot;
  • only_defects [bool, default True]: if False, images without defects are shown too;
  • multi_defects [bool, default False]: if True, show images with multiple defects.

{% endraw %}

ClassId = 1 identifies single or multiple defects in the form of rounded spots.

{% raw %}
show_defects(class_id=1, n=20)
{% endraw %}

ClassId = 2 identifies single or multiple groove defects.

{% raw %}
show_defects(class_id=2, n=20)
{% endraw %}

ClassId = 3 identifies single or multiple scratch defects.

{% raw %}
show_defects(class_id=3, n=20)
{% endraw %}

ClassId = 4 identifies single or multiple defects caused by the rolling process.

{% raw %}
show_defects(class_id=4, n=20)
{% endraw %}

To select images with more than one defect:

{% raw %}
show_defects(n=20, multi_defects=True)
{% endraw %}

Labels

To build masks for a segmentation model I followed two approaches:

1) The first approach creates masks for all the images in the training folder. The masks are built with multi_rle_to_mask, so they have shape (256, 1600), and are saved in labels_path. This is useful for plotting multi-class masks and for training with fast.ai DataLoaders.

2) The second approach creates masks only for images with defects and keeps the shape (256, 1600, 4). This is useful for pure PyTorch trainers.

NB: masks must be saved as PNG files, not JPEG, because JPEG compression is lossy and occasionally corrupts the label values (source).
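The two layouts can be converted into each other as long as defects do not overlap, since a single-channel mask stores only one class id per pixel. The sketch below expands a (height, width) label map into one binary channel per class; labels_to_channels_sketch is a hypothetical helper, not part of the library.

```python
import numpy as np

def labels_to_channels_sketch(labels: np.ndarray, n_classes: int = 4) -> np.ndarray:
    """Expand an (H, W) map of class ids into an (H, W, n_classes) binary mask."""
    class_ids = np.arange(1, n_classes + 1)
    # Broadcasting compares every pixel with every class id;
    # background pixels (0) match none of them.
    return (labels[..., None] == class_ids).astype(np.float32)

# Toy 2x3 label map: 0 is background, 1..4 are defect classes.
labels = np.array([[0, 3, 3],
                   [4, 0, 1]], dtype=np.uint8)
channels = labels_to_channels_sketch(labels)
print(channels.shape)  # (2, 3, 4)
```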

{% raw %}
{% endraw %} {% raw %}

create_masks[source]

create_masks(df:DataFrame)

Create the masks for ImageId in df

{% endraw %} {% raw %}
masks = create_masks(train_multi)
{% endraw %} {% raw %}
ntraining = get_image_files(train_path)
test_eq(len(ntraining), len(masks))
{% endraw %} {% raw %}
idx = get_random_idx(len(masks))
im = np.array(Image.open(masks[idx[0]]))
plt.imshow(im);
{% endraw %} {% raw %}
classes = [0, 1, 2, 3, 4]
class_ids = np.unique(im).tolist()
# every pixel value in the saved mask must be a known class id (0 = background)
assert all(c in classes for c in class_ids)
{% endraw %}